HOME/Articles/

2017-5-16 从字典中提取子集

Article Outline

需求

给你一个条件,从已给字典中构造一个符合条件的新字典,为原字典的子集。

解决方案

使用字典推导式:

prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}
# Make a dictionary of all prices over 200
p1 = {key: value for key, value in prices.items() if value > 200}
# Make a dictionary of tech stocks
tech_names = {'AAPL', 'IBM', 'HPQ', 'MSFT'}
p2 = {key: value for key, value in prices.items() if key in tech_names}

<!--more-->

案例

# 需求:输出phone_bill中每月不为'0.00'的项目
"phone_bill" : [
        {
            "bill_zengzhifei" : "0.00",
            "bill_qita" : "0.00",
            "bill_package" : "46.00",
            "bill_ext_sms" : "0.20",
            "bill_daishoufei" : "0.00",
            "bill_ext_data" : "0.00",
            "bill_ext_calls" : "0.00"
        },
        {
            "bill_zengzhifei" : "0.00",
            "bill_qita" : "0.00",
            "bill_package" : "46.00",
            "bill_ext_sms" : "0.60",
            "bill_daishoufei" : "0.00",
            "bill_ext_data" : "15.62",
            "bill_ext_calls" : "4.18"
        },
        {
            "bill_zengzhifei" : "0.00",
            "bill_qita" : "0.00",
            "bill_package" : "56.00",
            "bill_ext_sms" : "0.30",
            "bill_daishoufei" : "0.00",
            "bill_ext_data" : "9.36",
            "bill_ext_calls" : "7.03"
        },
        {
            "bill_zengzhifei" : "0.00",
            "bill_qita" : "0.00",
            "bill_package" : "46.00",
            "bill_ext_sms" : "0.30",
            "bill_daishoufei" : "0.00",
            "bill_ext_data" : "0.00",
            "bill_ext_calls" : "0.00"
        },
        {
            "bill_zengzhifei" : "0.00",
            "bill_qita" : "0.00",
            "bill_package" : "10.58",
            "bill_ext_sms" : "0.00",
            "bill_daishoufei" : "0.00",
            "bill_ext_data" : "0.00",
            "bill_ext_calls" : "0.00"
        }
    ]

# 处理逻辑
for cursor in xrange(0,5):
     phone_bill_tmp = {key: value[:-1] for key, value in phone_bill[cursor].items() if value not in ['','0.00']}
     phone_bill.append(phone_bill_tmp)

# 结果
"phone_bill" : [
        {
            "bill_package" : "46.0",
            "bill_ext_sms" : "0.2"
        },
        {
            "bill_package" : "46.0",
            "bill_ext_sms" : "0.6",
            "bill_ext_data" : "15.6",
            "bill_ext_calls" : "4.1"
        },
        {
            "bill_package" : "56.0",
            "bill_ext_sms" : "0.3",
            "bill_ext_data" : "9.3",
            "bill_ext_calls" : "7.0"
        },
        {
            "bill_package" : "46.0",
            "bill_ext_sms" : "0.3"
        },
        {
            "bill_package" : "10.5"
        }
    ]

思考

大多数情况下字典推导能做到的,通过创建一个元组序列然后把它传给 dict() 函数也能实现。比如:

p1 = dict((key, value) for key, value in prices.items() if value > 200)

但是,字典推导方式表意更清晰,并且实际上也会运行的更快些 (在这个p1中,实际测试几乎比 dcit() 函数方式快整整一倍)。

有时候完成同一件事会有多种方式。比如,p2程序也可以像这样重写:

# Make a dictionary of tech stocks
tech_names = { 'AAPL', 'IBM', 'HPQ', 'MSFT' }
p2 = { key:prices[key] for key in prices.keys() & tech_names }

但是,运行时间测试结果显示这种方案大概比第一种方案慢1.6倍。

所以,完成一个需求,方案并不是唯一的,也并没有最完美的,只有更好的解决方案,如果对程序运行性能要求比较高的话,这就需要花点时间去做计时测试了。